Method and apparatus for fault tolerant flash upgrading

ABSTRACT

A novel method for upgrading a first program sequence in a computer system such that the computer system remains operable even if the upgrade process results in an incorrectly stored program sequence. The method uses the steps of storing a second program sequence, which is an upgrade of the first program sequence, in a second region of a memory, determining whether the second program sequence is stored correctly, and enabling the second program sequence if it is stored correctly. The first program sequence remains enabled if the second program sequence is not stored correctly.

BACKGROUND OF THE INVENTION

[0001] 1. Field of the Invention

[0002] The present invention relates to the field of computer systems; more particularly, the present invention relates to a method and apparatus for performing fault tolerant Flash electrically erasable programmable read-only memory (EEPROM) upgrading.

[0003] 2. Description of Related Art

[0004] Embedded microcontrollers are increasingly used in computer systems. This is especially true for mobile computers. Microcontrollers are used for keyboard control, pointing device control, battery management, power plane control, thermal management, switch debouncing and management, and system management interfacing, for example.

[0005] When one of the devices that interacts with the microcontroller is upgraded, the firmware code that handles that interaction often needs to be upgraded. In addition, upgrades are often required when a bug is discovered in the firmware code or a work-around is required to avoid a bug in one of the devices of the computer system. Since the microcontroller typically interacts with so many elements of the computer system, including the operating system, pointing devices and battery, upgrades of the firmware code can be frequent.

[0006] Upgrades to the firmware code can be accomplished in a variety of ways. For example, upgrades can be performed by providing socketed parts that are typically replaced by a service provider. Alternatively, upgrades can be performed using downloadable RAM codestores, which are expensive and have high power consumption.

[0007] However, the most cost-effective and convenient method is the use of Flash electrically erasable programmable read-only memories (EEPROMs) or other Flash-based devices (e.g., a microcontroller with a Flash memory) to store the firmware code.

[0008] The use of Flash-based devices allows computer manufacturers to upgrade their computer systems using applications or basic input output system (BIOS) setup utilities that download new firmware code to a Flash memory. During the download operation, the old firmware code is erased and the new firmware code is written. A problem with this method is that the firmware code may be erased or corrupted if an error should occur during the download operation or the download operation is aborted prematurely. As a result, the computer may be rendered inoperable until it is returned to the computer manufacturer for expensive servicing.

[0009] Many techniques are employed to reduce the probability of erasure or corruption of the firmware code. Before beginning the download operation, the system verifies that there is sufficient power. A boot disk supplied by the computer manufacturer is used to ensure stable and known operating system conditions during the download operation. The power switch, reset button, and other state-changing switches are disabled to ensure continuous power and stable system conditions during the download operation. In addition, the system management interrupts and other system interrupts are disabled to reduce or eliminate interruptions of the download operation.

[0010] These techniques reduce the probability of erasure or corruption of the firmware code, but do not eliminate it. Despite all these precautions, portions of the firmware code can be corrupted. In addition, a disruption of power, for example, may cause the download operation to be prematurely terminated.

[0011] Therefore, it is desirable to provide a fault-tolerant upgrade process to upgrade firmware code such that the computer is still operable even if the upgrade results in erasure or corruption of the firmware code.

SUMMARY OF THE INVENTION

[0012] A fault-tolerant method and apparatus for performing a program upgrade. A first code sequence is stored in a first region of a memory and is enabled. An upgrade of the first code sequence is stored in a second region of the memory. A check is performed to determine whether the upgrade of the first code sequence is stored successfully. If the upgrade is stored successfully, the upgrade of the first code sequence is enabled. If the upgrade is not stored successfully, the first code sequence remains enabled, thereby maintaining operable code. The device therefore remains operable as the first code sequence remains enabled. Subsequent attempts at storing the upgrade of the first code sequence in the second region can then be performed until the upgrade is stored successfully.

BRIEF DESCRIPTION OF THE DRAWINGS

[0013] FIG. 1 illustrates one embodiment of a memory of the present invention.

[0014] FIG. 2 illustrates one embodiment of a memory structure of the present invention.

[0015] FIG. 3 illustrates one embodiment of a system of the present invention.

[0016] FIG. 4 illustrates one embodiment of a method for upgrading firmware code.

[0017] FIG. 5 illustrates one embodiment of a method for selecting the firmware code to use and enabling the firmware code.

DETAILED DESCRIPTION

[0018] The present invention uses a method for upgrading a first code sequence in a computer system such that the computer system is operable even if the upgrade process results in an incorrectly stored upgrade of the first code sequence. During the operation of the computer system, a first code sequence is stored in a first region of a memory and is enabled such that the computer system uses the first code sequence to operate. When an upgrade is to be performed, the upgrade of the first code sequence (a second code sequence) is stored in an unused region of the memory (the second region). During a reset, it is then determined whether the first and second regions contain successfully stored code sequences. If both the first and second code sequences are stored successfully, the code sequence with the more recent revision identifier (the second code sequence) is enabled. If the second code sequence is not stored correctly, the first code sequence is undisturbed and is enabled so that the computer system remains operable using the first code sequence. Subsequent attempts to upgrade the first code sequence may be made until the second code sequence is stored correctly and enabled.

[0019] An upgrade of the second code sequence (a third code sequence) is stored in the first region since the currently used region is the second region. During a reset, it is then determined whether the first and second regions contain successfully stored code sequences. If both the second and third code sequences are stored successfully, the code sequence with the more recent revision identifier (the third code sequence) is enabled. If the third code sequence is not stored correctly, the second code sequence is undisturbed and is enabled so that the computer system remains operable using the second code sequence. Subsequent attempts to upgrade the second code sequence may be made until the third code sequence is stored correctly and enabled.

[0020] FIG. 1 illustrates a memory 100 of the present invention. The memory 100 has a region 101 which contains an interrupt vector table, an interrupt redirection table, and a cold reset handler. The memory 100 also has a first region 102 in which a first code sequence is stored and a second region 103 in which a second code sequence is stored. The first code sequence and the second code sequence are different revisions of the same code.

[0021] The cold reset handler is a code sequence used to implement the method of identifying which of the first and second code sequences is to be used and enabling that code sequence as described below.

[0022] In one embodiment, the memory 100 is a Flash memory and the region 101, the first region 102, and the second region 103 correspond to independently write-protectable regions within the Flash memory. For example, the Flash memory may be configured to allow a block-erase operation and write operations within the first region 102 while preventing these operations to the second region 103 and vice-versa. In another embodiment, the region 101 is contained in a read-only memory (ROM) and the first region 102 and the second region 103 are contained in independently write-protectable regions within the Flash memory. It will be apparent to one skilled in the art that other nonvolatile memory technologies can be used. In one embodiment, the first region 102 and the second region 103 are not write-protectable. In another embodiment, the write-protection of the first region 102 and the second region 103 cannot be independently set. In other words, either both regions are write-protected or neither region is write-protected. In still another embodiment, the first region 102 and the second region 103 each consist of non-contiguous blocks of memory.

[0023] FIG. 2 illustrates the interrelation of an interrupt vector table 201, an interrupt redirection table 202, a random access memory (RAM) vector address table 203, and the first and second code sequences. The contents of the RAM vector address table 203 can be stored in RAM rather than nonvolatile memory because the contents are computed after the cold reset handler determines which of the first and second code sequences is to be enabled using the method described below. Therefore, the contents of the RAM vector address table 203 do not need to be maintained during power down. Each time the system is reset after power up, a determination of which region is to be enabled is made and the contents of the RAM vector address table 203 are computed accordingly.

[0024] The interrupt vector table 201 contains vectors corresponding to various interrupt types. Each of the vectors points to a corresponding indirect jump instruction in the interrupt redirection table 202. Each indirect jump references a corresponding address indicated in the RAM vector address table 203. Each of the addresses in the RAM vector address table 203 corresponds to an address in the corresponding interrupt handler within either the first region 102 or the second region 103, depending on whether the first or second code sequence is enabled.

[0025] For example, if an interrupt accesses the vector corresponding to IRQ0 in the interrupt vector table 201, the interrupt begins processing code at the IRQ0 address in the interrupt redirection table 202. The indirect jump at the IRQ0 address jumps to an address indicated in a corresponding location in the RAM vector address table 203. In one embodiment, the base address of the first region is 0100h (where h indicates the value is in hexadecimal), the base address of the second region is 1000h, and the offset of the IRQ0 handler is 0208h. When the address in the RAM vector address table 203 contains 0308h, the interrupt uses the interrupt handler in the first region (0100h+0208h). When the address in the RAM vector address table 203 contains 1208h, the interrupt uses the interrupt handler in the second region (1000h+0208h).
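The lookup chain of FIG. 2 can be illustrated with a small model. The following Python sketch is only a simulation of the table relationships, not firmware; the dictionary layout, the `dispatch` function, and the slot indices are hypothetical names introduced for illustration, while the base addresses and the IRQ0 offset are taken from the example above.

```python
# Hypothetical model of the lookup chain in FIG. 2; all names are illustrative.
FIRST_REGION_BASE = 0x0100   # base address of the first region (from the example)
SECOND_REGION_BASE = 0x1000  # base address of the second region (from the example)
IRQ0_OFFSET = 0x0208         # offset of the IRQ0 handler within a region

# Interrupt vector table 201: interrupt type -> slot in the redirection table.
interrupt_vector_table = {"IRQ0": 0}

# Interrupt redirection table 202: slot -> index into the RAM vector address
# table. Each entry stands in for an indirect jump through that RAM location.
interrupt_redirection_table = {0: 0}


def dispatch(irq, ram_vector_address_table):
    """Return the handler address an interrupt would reach through the tables."""
    slot = interrupt_vector_table[irq]
    ram_index = interrupt_redirection_table[slot]
    return ram_vector_address_table[ram_index]


# First code sequence enabled: the RAM table holds 0100h + 0208h.
print(hex(dispatch("IRQ0", [FIRST_REGION_BASE + IRQ0_OFFSET])))   # 0x308
# Second code sequence enabled: the RAM table holds 1000h + 0208h.
print(hex(dispatch("IRQ0", [SECOND_REGION_BASE + IRQ0_OFFSET])))  # 0x1208
```

Only the RAM table contents differ between the two calls; the two tables modeled as residing in region 101 are never rewritten.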

[0026] FIG. 3 illustrates a system of the present invention. The system comprises the memory 100 and a RAM 300 coupled to a controller 310 through a bus 330 and an input device 320 coupled to the controller through a bus 340. The memory is configured as described above. The RAM 300 is used to store the RAM vector address table as described above. The controller 310 writes and reads to the memory 100 and the RAM 300 through the bus 330 to implement the methods of the present invention. The controller accesses the input device 320 through the bus 340 to receive the code sequences to be stored in the memory 100.

[0027] It will be apparent to one skilled in the art that the controller 310 may be any device capable of controlling the upgrade of the memory 100 according to the methods of the present invention. For example, the controller 310 may be a microcontroller which is programmed to perform the methods of the present invention. Alternatively, the controller 310 may be a microprocessor capable of executing a program contained in the RAM 300, for example, to perform the methods of the present invention.

[0028] It will be apparent to one skilled in the art that the input device 320 may be any device that is capable of receiving the upgrade code sequence and providing it to the controller 310. For example, the input device may be a floppy disk subsystem for reading a floppy disk containing the upgrade code sequence. Alternatively, the input device may be a tape drive system for reading a tape containing the upgrade code sequence.

[0029] The use of an interrupt redirection table increases the latency of interrupts by the amount of time required to execute the indirect jump. For example, the increased latency would be 0.6 microseconds for a microcontroller running at 10 MHz and using an indirect jump instruction requiring 6 clock cycles. Other methods of performing an indirect jump may be used. In one embodiment, each element in the interrupt redirection table contains a code sequence that loads a corresponding address from the RAM vector address table into a register, pushes that address onto a stack, and transfers program control to the address on the stack. For example, the increased latency of performing an indirect jump using the address on the stack would be 3.2 microseconds on a microcontroller running at 10 MHz and using an indirect jump sequence requiring 32 clock cycles. The increased interrupt latency is insignificant for most applications. It will be apparent to one skilled in the art that the interrupt latency will depend on specific factors such as the indirect jump sequence used, the operating frequency of the controller that executes that indirect jump sequence, and the latency of the memory.
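The latency figures above follow from dividing the cycle count of the jump sequence by the clock frequency. A minimal Python sketch of that arithmetic is given here; the function name `added_latency_us` is a hypothetical helper, and memory latency is ignored.

```python
def added_latency_us(jump_cycles: int, clock_hz: int) -> float:
    """Extra interrupt latency, in microseconds, from the redirection jump."""
    return jump_cycles * 1_000_000 / clock_hz


CLOCK_HZ = 10_000_000  # 10 MHz, as in the example above

print(added_latency_us(6, CLOCK_HZ))   # single indirect jump instruction
print(added_latency_us(32, CLOCK_HZ))  # load, push, and transfer sequence
```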

[0030] FIG. 4 illustrates a method for storing an upgrade to a code sequence in the memory 100. In one embodiment, the program which implements the upgrade method (upgrade program) is contained on the storage media containing the upgrade version of the code sequence. It will be apparent to one skilled in the art that the upgrade program may be stored in other nonvolatile memory for access during the upgrade process.

[0031] At step 410, the upgrade program determines whether the code sequence stored in the first region 102 (the first code sequence) or the code sequence stored in the second region 103 (the second code sequence) is currently in use to operate the computer. In one embodiment, the region in use is identified by accessing a memory containing a region-in-use identifier. In another embodiment, the region in use is identified by accessing addresses (or portions of addresses indicating the base address value) within the RAM vector address table and determining which region is pointed to by these addresses. In still another embodiment, the technique used to select the region in use during a cold reset (described below) is used to determine which region is currently in use.

[0032] At step 420, the upgrade program selects the region that contains the code sequence that is not currently in use. Should the upgrade fail, the currently used version is undisturbed in the unselected region and available for use.

[0033] In steps 410 and 420, the upgrade program determines which region is in use and then selects another region which is therefore not in use. In an alternative embodiment, the upgrade program selects regions arbitrarily or through some selection algorithm, for example, until it identifies a selected region that is not in use.

[0034] At step 430, the upgrade program writes to the selected region with the new version of the code sequence.
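The upgrade steps of FIG. 4 can be sketched as follows. This Python model is an illustration under stated assumptions: the regions are represented as a list of `bytearray` objects, the region in use is passed in as an index (standing in for a region-in-use identifier), and the function names are hypothetical. Erase operations and write protection are not modeled.

```python
def select_upgrade_region(region_in_use: int, region_count: int = 2) -> int:
    """Choose a region other than the one currently in use."""
    for candidate in range(region_count):
        if candidate != region_in_use:
            return candidate
    raise ValueError("no unused region available")


def store_upgrade(regions: list, region_in_use: int, new_image: bytes) -> int:
    """Write the new image into the selected region only (step 430)."""
    target = select_upgrade_region(region_in_use, len(regions))
    if len(new_image) > len(regions[target]):
        raise ValueError("image does not fit in the selected region")
    regions[target][:len(new_image)] = new_image
    return target


# Region 0 holds the code in use; region 1 is free to receive the upgrade.
regions = [bytearray(b"\x01" * 8), bytearray(8)]
written = store_upgrade(regions, region_in_use=0, new_image=b"\xAA\xBB")
print(written, regions[0] == bytearray(b"\x01" * 8))  # 1 True
```

Because the write touches only the selected region, an interrupted or failed write leaves the region in use exactly as it was.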

[0035] Each time the computer system is reset during a power up (cold reset), the code sequences in the first region 102 and the second region 103 are evaluated to determine which is the latest correctly stored version and the latest correctly stored version is enabled.

[0036] FIG. 5 illustrates the method of enabling the latest version of the code sequence. In one embodiment, these steps are controlled by the cold reset handler. However, it will be apparent to one skilled in the art that the software sequence may be performed using a warm reset should the upgrade process be performed without powering down the computer system.

[0037] At step 500, the cold reset handler determines whether the code sequence in the first region 102 (the first code sequence) is stored correctly. In one embodiment, this is accomplished by computing a checksum of the code sequence according to well-known methods. In another embodiment, this may be accomplished by checking the parity of each element of the code sequence according to well-known methods. It will be apparent to one skilled in the art that other methods may be employed to check the integrity of the first code sequence and that multiple methods may be employed.
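One possible form of the checksum test is sketched below in Python. The image layout is an assumption made for illustration: the last byte of each image is taken to hold an 8-bit additive checksum of the preceding bytes. The function names `checksum8`, `seal`, and `stored_correctly` are hypothetical. A simple sum does not detect every corruption pattern, which is one reason other integrity checks may be preferred.

```python
def checksum8(data: bytes) -> int:
    """8-bit additive checksum: sum of all bytes, modulo 256."""
    return sum(data) & 0xFF


def seal(payload: bytes) -> bytes:
    """Append the checksum byte to build an image (assumed layout)."""
    return payload + bytes([checksum8(payload)])


def stored_correctly(image: bytes) -> bool:
    """Recompute the sum over the payload and compare with the stored byte."""
    if len(image) < 2:
        return False
    return checksum8(image[:-1]) == image[-1]


good = seal(b"\x12\x34\x56")
damaged = bytes([good[0] ^ 0x01]) + good[1:]  # flip one bit in the payload
print(stored_correctly(good), stored_correctly(damaged))  # True False
```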

[0038] At step 510, the cold reset handler determines whether the code sequence in the second region 103 (the second code sequence) is stored correctly using the methods described above.

[0039] At step 520, the cold reset handler selects the code sequence that has a more recent revision identifier, if both the first and second code sequences are stored correctly. Two correctly stored code sequences are typically found when the previous version has been successfully upgraded.

[0040] At step 530, the cold reset handler selects the code sequence that is stored correctly, if only one of the first and second code sequences is stored correctly. One correctly stored code sequence is typically found when the upgrade process failed to complete successfully or when the computer system has never been upgraded. By selecting the correctly stored previous version rather than the incorrectly stored upgrade version, the system remains operable.
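The selection made in steps 500 through 530 can be summarized in a short Python sketch. The function `select_region` and its parameters are hypothetical. Two behaviors are assumptions not specified in the description: revision identifiers are modeled as integers where a larger value is more recent, with ties resolved in favor of the first region, and `None` is returned when neither region is stored correctly.

```python
from typing import Optional


def select_region(first_ok: bool, second_ok: bool,
                  first_rev: int, second_rev: int) -> Optional[int]:
    """Return 1 or 2 for the region to enable, or None if neither is valid."""
    if first_ok and second_ok:
        # Step 520: both stored correctly, so prefer the more recent revision.
        # Tie-breaking toward region 1 is an assumption of this sketch.
        return 1 if first_rev >= second_rev else 2
    if first_ok:
        # Step 530: only the first code sequence is stored correctly.
        return 1
    if second_ok:
        # Step 530: only the second code sequence is stored correctly.
        return 2
    return None  # Case not addressed by the description.


print(select_region(True, True, 3, 4))   # 2: successful upgrade
print(select_region(True, False, 3, 4))  # 1: failed upgrade, old code kept
```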

[0041] At step 540, the cold reset handler enables the selected code sequence. In one embodiment, the selected code sequence is enabled by storing the interrupt vectors corresponding to the selected code sequence in the RAM vector address table. These interrupt vectors are computed by adding the base address of the region containing the selected code sequence to the corresponding offsets associated with each interrupt. In one embodiment, these offsets are fixed values which are independent of the version of the code sequence. In another embodiment, these offsets may be determined by other means, such as retrieving these values from a portion of the selected code sequence. It will be apparent to one skilled in the art that any method that selectively directs execution to the selected code sequence as opposed to the other code sequences may be used to enable the selected code sequence.
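For the embodiment with fixed offsets, enabling a code sequence amounts to recomputing every entry of the RAM vector address table from one base address. The Python sketch below illustrates this; `build_ram_vector_table` is a hypothetical name, and the IRQ1 entry and its offset are invented for illustration. Only the IRQ0 offset and the base addresses come from the earlier example.

```python
def build_ram_vector_table(region_base: int, handler_offsets: dict) -> dict:
    """Compute each interrupt vector as region base plus handler offset."""
    return {irq: region_base + offset
            for irq, offset in handler_offsets.items()}


# Offsets assumed fixed across versions; the IRQ1 value is illustrative only.
HANDLER_OFFSETS = {"IRQ0": 0x0208, "IRQ1": 0x0240}

first_enabled = build_ram_vector_table(0x0100, HANDLER_OFFSETS)
second_enabled = build_ram_vector_table(0x1000, HANDLER_OFFSETS)
print(hex(first_enabled["IRQ0"]), hex(second_enabled["IRQ0"]))  # 0x308 0x1208
```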

[0042] The present invention increases the delay for a reset. This delay is largely due to the time required to determine whether each of the two regions contains correctly stored code. For example, the increased delay for a reset is approximately 26 milliseconds when each loop of a code sequence that sums the word elements of the two 16K byte regions takes 16 clock cycles on a microcontroller that operates at 10 MHz. Alternatively, the increased delay for a reset is approximately 46 milliseconds when each loop of a code sequence that sums the byte elements of the two 16K byte regions takes 14 clock cycles on a microcontroller that operates at 10 MHz. This is not an appreciable increased delay for a reset for most applications. It will be apparent to one skilled in the art that the increased reset delay will depend on specific factors such as the checksum code sequence used, the operating frequency of the controller that executes that checksum code sequence, and the latency of the memory.
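The reset delay figures can be reproduced with the following Python sketch. It assumes that a word is 16 bits (2 bytes), which the description does not state but which is consistent with the 26 millisecond figure, and that one loop iteration processes one element. The function name `checksum_delay_ms` is hypothetical.

```python
def checksum_delay_ms(region_bytes: int, region_count: int,
                      element_bytes: int, cycles_per_loop: int,
                      clock_hz: int) -> float:
    """Time, in milliseconds, to sum every element of every region once."""
    loops = region_count * region_bytes // element_bytes
    return loops * cycles_per_loop * 1000 / clock_hz


REGION_BYTES = 16 * 1024  # 16K bytes per region
CLOCK_HZ = 10_000_000     # 10 MHz

# Word-wise sum (assumed 2-byte words), 16 cycles per loop.
print(checksum_delay_ms(REGION_BYTES, 2, 2, 16, CLOCK_HZ))
# Byte-wise sum, 14 cycles per loop.
print(checksum_delay_ms(REGION_BYTES, 2, 1, 14, CLOCK_HZ))
```

Under these assumptions the two results are about 26.2 and 45.9 milliseconds, matching the approximate values given above.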

[0043] Other embodiments of the present invention may be implemented. Whereas many alterations and modifications of the present invention will no doubt become apparent to a person of ordinary skill in the art after having read the foregoing description, it is to be understood that the particular embodiments shown and described by way of illustration are in no way intended to be considered limiting. Therefore, references to details of the preferred embodiment are not intended to limit the scope of the claims which in themselves recite only those features regarded as essential to the invention.

[0044] For example, a memory containing two regions which are capable of storing code sequences is described above. It will be apparent to one skilled in the art that the present invention may be practiced using more than two regions capable of storing code sequences. In addition, an indirect jump is used to selectively enable a code sequence in the description above. It will be apparent to one skilled in the art that other methods of selectively enabling the code sequences may be used. For example, the interrupt vector table itself may be modified to directly access the enabled code sequence. Alternatively, a direct jump instruction address may be modified to address the enabled code sequence.

[0045] Furthermore, revision identifiers are referenced in the description above to determine which of the successfully stored code sequences is the latest revision. In one embodiment, the upgrade code sequence modifies a portion of the enabled version of the code sequence such that the enabled version is no longer stored correctly after it determines that the upgrade version of the code sequence is stored successfully. In this embodiment, the cold reset handler identifies the only successfully stored code sequence rather than comparing revision identifiers to determine which code sequence to enable.

[0046] The enabled version may be modified such that it is no longer stored successfully but still able to operate correctly by modifying the checksum stored with the program, for example. During subsequent resets, the cold reset handler will determine that the previously enabled version of the code sequence is not stored correctly because the checksum no longer corresponds to the sum of the data elements of the enabled version of the code sequence. Since the rest of the code sequence (besides the checksum) is still correct, the previously enabled version of the code sequence continues to operate correctly after the upgrade program modifies the checksum. During subsequent resets, the cold reset handler will only find one correctly stored version of the code sequence, the upgrade version of the code sequence, and the upgrade version is enabled. 
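The checksum-modification alternative can be illustrated with a brief Python sketch. It assumes the last byte of an image holds an 8-bit additive checksum of the preceding bytes, and the names `checksum8`, `stored_correctly`, and `retire` are hypothetical. The model ignores flash programming constraints, such as bits being changeable in only one direction without an erase.

```python
def checksum8(data: bytes) -> int:
    """8-bit additive checksum: sum of all bytes, modulo 256."""
    return sum(data) & 0xFF


def stored_correctly(image: bytes) -> bool:
    """Compare the recomputed payload sum with the stored checksum byte."""
    return len(image) >= 2 and checksum8(image[:-1]) == image[-1]


def retire(image: bytearray) -> None:
    """Invalidate an image by altering only its stored checksum byte."""
    image[-1] ^= 0xFF  # inverting every bit guarantees a mismatch


old = bytearray(b"\x01\x02\x03")
old.append(checksum8(old))        # previously enabled version, valid
payload_before = bytes(old[:-1])

retire(old)                       # done after the upgrade is verified
print(stored_correctly(old), bytes(old[:-1]) == payload_before)  # False True
```

The executable bytes are unchanged, so the retired version can keep running until the next reset, at which point only the upgrade version passes the integrity check.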

What is claimed is:
 1. A method for upgrading a code sequence in a memory comprising the steps of: identifying a non-enabled region of said memory; and storing an upgrade version of said code sequence in said non-enabled region of said memory, said upgrade version of said code sequence being an upgrade of an enabled version of said code sequence in an enabled region of said memory.
 2. The method of claim 1 wherein the step of identifying said non-enabled region of said memory comprises the step of reading a region-in-use identifier.
 3. The method of claim 1 wherein the step of identifying said non-enabled region of said memory comprises the step of reading a portion of an interrupt redirection table.
 4. The method of claim 1 wherein the step of identifying said non-enabled region of said memory comprises the steps of: determining whether a first version of said code sequence having a first revision identifier is stored successfully in a first region; determining whether a second version of said code sequence having a second revision identifier is stored successfully in a second region; and if said first version of said code sequence and said second version of said code sequence are stored successfully: comparing said first revision identifier to said second revision identifier to determine whether said first version of said code sequence is more recent than said second version of said code sequence; and identifying said first region as said non-enabled region, if said second version of said code sequence is more recent than said first version of said code sequence; and identifying said second region as said non-enabled region, if said first version of said code sequence is more recent than said second version of said code sequence; and identifying said first region as said non-enabled region, if said first version of said code sequence is not stored successfully; and identifying said second region as said non-enabled region, if said second version of said code sequence is not stored successfully.
 5. The method of claim 1 comprising the steps of: determining whether said upgrade version of said code sequence is stored successfully; writing to a portion of said enabled version of said code sequence in said enabled region of said memory such that the enabled version of said code sequence is not stored successfully.
 6. The method of claim 1 further comprising the steps of: determining that said upgrade version of said code sequence is not stored successfully; storing said upgrade version of said code sequence in said non-enabled region of said memory.
 7. The method of claim 1 further comprising the steps of: determining that said upgrade version of said code sequence is not stored successfully; storing said upgrade version of said code sequence in a second non-enabled region of said memory.
 8. A method for enabling a code sequence in a memory, said method comprising the steps of: determining whether a first version of said code sequence is stored successfully in a first region of said memory; determining whether a second version of said code sequence is stored successfully in a second region of said memory; enabling said first version of said code sequence, if said second version of said code sequence is not stored successfully; and enabling said second version of said code sequence, if said first version of said code sequence is not stored successfully.
 9. The method of claim 8 wherein said first version of said code sequence has a first revision identifier and said second version of said code sequence has a second revision identifier further comprising the steps of: if said first version of said code sequence and said second version of said code sequence are stored successfully: comparing said first revision identifier to said second revision identifier to determine whether said first version of said code sequence is more recent than said second version of said code sequence; enabling said first version of said code sequence, if said first version of said code sequence is more recent than said second version of said code sequence; and enabling said second version of said code sequence, if said second version of said code sequence is more recent than said first version of said code sequence.
 10. The method of claim 8 wherein the step of determining whether said first version of said code sequence is stored successfully comprises the steps of: computing a checksum for said first version of said code sequence; and determining whether said checksum is correct.
 11. The method of claim 8 wherein the step of determining whether said second version of said code sequence is stored successfully comprises the steps of: computing a checksum for said second version of said code sequence; and determining whether said checksum is correct.
 12. The method of claim 8 wherein said step of enabling said first version of said code sequence comprises the step of directing interrupts to said first region.
 13. The method of claim 12 wherein said memory comprises a third region comprising a first table and a second table, wherein said first table comprises interrupt vectors pointing to a corresponding code sequence in said second table, each of said corresponding code sequences comprising an indirect jump instruction capable of accessing said first and second regions depending on a corresponding value in a third table in a second memory, said step of directing interrupts to said first region comprising the step of updating said third table.
 13. The method of claim 8 wherein said step of enabling said second version of said code sequence comprises the step of directing interrupts to said second region.
 14. The method of claim 13 wherein said memory comprises a third region comprising a first table and a second table, wherein said first table comprises interrupt vectors pointing to a corresponding code sequence in said second table, each of said corresponding code sequences comprising an indirect jump instruction capable of accessing said first and second regions depending on a corresponding value in a third table in a second memory, said step of directing interrupts to said second region comprising the step of updating said third table.
 15. A memory comprising: a first region having a first code sequence having a first revision identifier; a second region having a second code sequence having a second revision identifier, said second code sequence being an upgrade of said first code sequence; and a third region comprising a first table and a second table, wherein said first table comprises interrupt vectors pointing to a corresponding code sequence in said second table, each of said corresponding code sequences pointing to a corresponding vector of said second region.
 16. The memory of claim 15 wherein said first region and said second region are independently write-protectable.
 17. The memory of claim 15 wherein said third region is independently write-protectable.
 18. The memory of claim 15 wherein said first region and said second region are a FLASH memory.
 19. The memory of claim 15 wherein said third region is a FLASH memory.
 20. The memory of claim 15 wherein said third region is a Read-Only Memory (ROM).
 21. A controller for upgrading a code sequence in a memory, said controller comprising: logic for identifying a non-enabled region of said memory; logic for storing an upgrade version of said code sequence in a non-enabled region of said memory, said upgrade version of said code sequence being an upgrade of an enabled version of said code sequence in an enabled region of said memory.
 22. A controller for enabling a code sequence in a memory, said controller comprising: logic for determining whether a first code sequence having a first revision identifier is stored successfully in a first region of said memory; logic for determining whether a second code sequence having a second revision identifier is stored successfully in a second region of said memory; logic for enabling said first code sequence, if said second code sequence is not stored successfully; and logic for enabling said second code sequence, if said first code sequence is not stored successfully.
 23. The controller of claim 22 further comprising: logic for determining whether said first code sequence is more recent than said second code sequence; logic for enabling said first code sequence, if said first code sequence is more recent than said second code sequence; and logic for enabling said second code sequence, if said second code sequence is more recent than said first code sequence. 